Automatically Assessing Machine Summary Content Without a Gold Standard
Similar resources
Automatically Assessing Machine Summary Content Without a Gold Standard
The most widely adopted approaches for evaluation of summary content follow some protocol for comparing a summary with gold-standard human summaries, which are traditionally called model summaries. This evaluation paradigm falls short when human summaries are not available and becomes less accurate when only a single model is available. We propose three novel evaluation techniques. Two of them ...
Assessing hydration status: the elusive gold standard.
Acknowledging that total body water (TBW) turnover is complex, and that no measurement is valid for all situations, this review evaluates 13 hydration assessment techniques. Although validated laboratory methods exist for TBW and extracellular volume, no evidence incontrovertibly demonstrates that any concentration measurement, including plasma osmolality (P(osm)), accurately represents TBW gai...
Gold standard in assessing baroreceptive function.
Hypertension is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231. Print ISSN: 0194-911X. Online ISSN: 1524-4563. Copyright © 2004 American Heart Association, Inc. All rights reserved. doi: 10.1161/01.HYP.0000120966.82562.9
The Gold Standard: Automatically Generating Puzzle Game Levels
KGoldrunner is a puzzle-oriented platform game with dynamic elements. This paper describes Goldspinner, an automatic level generation system for KGoldrunner. Goldspinner has two parts: a genetic algorithm that generates candidate levels, and simulations that use an AI agent to attempt to solve the level from the player’s perspective. Our genetic algorithm determines how “good” a candidate level...
Machine Translation for Multilingual Summary Content Evaluation
The multilingual summarization pilot task at TAC’11 exposed many of the problems we face when we try to evaluate summary quality in different languages. The additional language dimension greatly increases annotation costs. For the TAC pilot task, English articles were first translated into six other languages, model summaries were written, and submitted system summaries were evaluated. We start wit...
Journal
Journal title: Computational Linguistics
Year: 2013
ISSN: 0891-2017, 1530-9312
DOI: 10.1162/coli_a_00123